The invention relates to a user interaction system, comprising:
an electrical apparatus;
a portable pointing device operable by a user for pointing to a region in space;
a camera taking a picture; and
a digital signal processor, capable of receiving and processing the picture, and
capable of transmitting user interface information derived from the picture to the electrical
apparatus.

The invention also relates to a pointing device for use in a user interaction
system.

The invention also relates to an electrical apparatus for use in the
above-mentioned user interaction system.



Such a system is known from US-A-2001/0030668. This document describes a
universal remote control system in which a user marks a location on a display with a pointing
device displaying a recognizable characteristic, e.g. a laser pointer emitting a red spot, and a
camera takes a picture of the display and calculates to which location the user has pointed.
An electrical apparatus, e.g. a computer or a television, then performs an action
corresponding to the marking by the user. E.g. if an icon on the display is marked by flashing
the laser pointer twice, a corresponding computer program is started.

It is a disadvantage of the known system that the system is tailored for a
particular display. E.g. the display may be a projection from an LCD based front projector on
a screen, to which projector a laptop is connected. The camera in the known system has such
a position and viewing angle that it is capable of imaging the screen. Next, the DSP is trained
to discriminate an area in pictures from the camera, which corresponds to the location of the
screen. In the prior art this can be done in a simple way. First, a picture of the screen is
captured both with and without illumination with a picture from the projector. By comparing
the difference of those two pictures, the area of the display in space as viewed by the camera
is obtained. Second, a perspective transformation is derived which determines the relation
between a location pointed to on the display and a location in the picture for projection, in the
frame buffer of the laptop. The known document also describes pointing to a virtual display,
e.g. a wall, by first manually selecting the corners of the virtual display. However, the system
is always calibrated for one display and in case a user wants to interact with another display,
he has to set up the system again, or use a second system.

It is a first object of the invention to provide a user interaction system of the
kind described in the opening paragraph in which it is relatively easy to interact with a
number of available electrical apparatuses, more particularly displays.

It is a second object of the invention to provide a pointing device of the kind 
described in the opening paragraph which makes interaction with a number of available 
electrical apparatuses relatively easy. 

It is a third object of the invention to provide an electrical apparatus for use in the
above-mentioned user interaction system.

The first object is realized in that the camera is connected to the pointing
device so that in operation it images the region pointed to. A fixed camera only images a
certain region of space. It is possible that in the known user interaction system, the user's
pointing device, or the object pointed to, is not even in the field of view of the fixed camera.
It is a major advantage of a pointing device with an attached camera, e.g. on the front end of
the pointing device closest to the region pointed to, that the camera is always imaging the
region pointed to substantially optimally. The object pointed to is in general the electrical
apparatus to which the user interface information, e.g. apparatus control data, is sent, but can
be any object present in the room in which the user resides, as long as it can be recognized by
the DSP. The apparatus to be controlled is then identified e.g. by pointing to it previously.

In an embodiment according to claim 2 the user interface information
comprises apparatus control data for controlling operation of the electrical apparatus. There
can be many apparatuses present in a living room which e.g. produce sound. Each of these
devices has a control for increasing or decreasing the sound volume. By pointing the camera
of the pointing device towards a particular apparatus and performing a "volume up"
command, e.g. by pressing a button on the pointing device or making a specific movement
with the pointing device, the sound volume of the device pointed to will increase. Using a
known universal remote control can have as a disadvantage e.g. that an apparatus responds
which was not intended by the user, because it uses the same remote control code.

It is advantageous if as in claim 3 the digital signal processor comprises an
object characterizing means for characterizing an object or part of the object present in the
picture of the region imaged by the camera, by providing first object characterizing features
to a comprised object identification means for identifying the object, and which object
identification means is capable of outputting object identification data from which the user
interface information is constructed.

In case different electrical apparatuses are to be controlled, they all have to be
recognized automatically. The DSP can be e.g. a generic processor running algorithms for
characterizing the apparatus, or it can be an ASIC. Typically the object characterizing means
will perform image-processing algorithms on a picture of the apparatus. E.g. the relative
dimensions of the apparatus can be calculated and the number and positions of buttons on the
apparatus can be analyzed. The object identification means associates with these resulting first
object characterizing features a specific apparatus, e.g. from information in an apparatus
table. Other information about the apparatus, e.g. "first television produced by company X
with serial number Y" may be present, like the apparatus control data that the apparatus
understands.
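
As an illustration of the apparatus table lookup described above, the following minimal sketch compares a feature vector with stored table entries and returns the closest apparatus. The choice of features (width/height ratio and number of buttons), the table contents, the distance threshold and all names are assumptions made for this example only; they are not prescribed by the description.

```python
import math

# Hypothetical apparatus table. Each entry stores assumed features
# (width/height ratio, number of buttons) and the commands it understands.
APPARATUS_TABLE = {
    "television": {"features": (1.78, 6.0), "commands": ("on", "off", "volume up")},
    "audio apparatus": {"features": (3.0, 12.0), "commands": ("on", "off", "volume up")},
}


def identify_apparatus(features, table=APPARATUS_TABLE, max_distance=2.0):
    """Return the name of the table entry closest to `features`, or None.

    The Euclidean distance on unscaled features is an assumption; a real
    system would normally normalize the features first.
    """
    best_name, best_distance = None, math.inf
    for name, entry in table.items():
        distance = math.dist(features, entry["features"])
        if distance < best_distance:
            best_name, best_distance = name, distance
    # Reject the match if even the closest entry is too far away.
    return best_name if best_distance <= max_distance else None
```

Rejecting matches beyond a threshold keeps an unknown object from being mapped onto whichever apparatus happens to be nearest in feature space.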

It is also advantageous if as in claim 4 the digital signal processor comprises:
motion trajectory estimation means for estimating a motion trajectory of the
pointing device and outputting a first motion characterizing signature, a signature being a
mathematical abstraction of the motion trajectory; and
signature identification means for identifying the first motion characterizing
signature and outputting command identification data, which represents a user interaction
command, corresponding with the first motion characterizing signature, from which
command identification data the user interface information is constructed.
To avoid the need for a large number of buttons on the pointing device for all
the different commands that can be sent to all the different apparatuses, and to make the user
interaction system more user friendly, it is advantageous if at least some of the apparatus
control data is generated on the basis of movements of the pointing device by the user, which
pointing device is typically in his hand. The signature of an upwards movement could mean
"volume up", but in contrast with a standard remote control, the amount of volume change
can according to the invention depend on the brusqueness of the motion. If the user moves
his arm up slowly, the volume should e.g. change only slightly, and in contrast if he moves
his arm up fast, the volume should go up a lot. Determination of the motion of the pointing
device can be done on the basis of motion sensing means, e.g. a mass on a deformation
sensor, a gyroscope, differential GPS, etc. However, since a camera is already present, the
motion can also be determined by imaging successive pictures and applying a motion
estimation algorithm. Since global motion estimation is simpler than precise motion
estimation of multiple objects, the camera can be a simple CMOS based camera or even a
two-dimensional array of a small number of light sensitive elements.
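
The two ideas in this paragraph can be sketched as follows, under stated assumptions. The first function estimates a single global shift between two successive small grayscale pictures by exhaustive search over candidate shifts (minimum mean absolute difference); the second maps the speed of the motion to a volume step. The search range, the gain, the clamping value and all function names are hypothetical and chosen for illustration only.

```python
def estimate_global_shift(prev, curr, max_shift=2):
    """Return (dx, dy) such that curr[y + dy][x + dx] best matches prev[y][x].

    `prev` and `curr` are equally sized lists of rows of gray values.
    Every candidate shift is scored by the mean absolute difference over
    the overlapping pixels; the lowest score wins.
    """
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(height):
                for x in range(width):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += abs(curr[yy][xx] - prev[y][x])
                        count += 1
            score = total / count
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]


def volume_step(pixels_per_frame, gain=2.0, max_step=10):
    """Map motion speed to a volume step: slow motion gives a small step."""
    return min(max_step, int(round(gain * abs(pixels_per_frame))))
```

Because only one shift is estimated for the whole picture, very coarse sensors suffice, which matches the remark that a small array of light sensitive elements may be enough.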

It is also advantageous if as in claim 5 the digital signal processor comprises
identification improvement means, which are capable of further improving a probability that
the object represented as object identification data, and user interaction command represented
as command identification data, are more reliably identified based on predetermined rules,
yielding more reliable user interface information. Since identification of the object and
motion signature need not always be perfect, given the variability of conditions in a room,
identification improvement means can be incorporated which e.g. apply an artificial
intelligence set of rules on the identified object identification data and command
identification data, possibly also taking into account the first object characterizing
features and the first motion characterizing signature. E.g., a rule for determining
the apparatus pointed to can take into account the room in which the pointing device is
present, or the time of the day. E.g. the first thing a user may typically do if he comes
home and picks up the pointing device is to switch on the lights. His intention can be verified
by the DSP on the basis of a picture of these lights.

It is further advantageous if as in claim 6 the predetermined rules comprise
probabilistic calculation of the likelihood of an {object identification data, command
identification data} pair, taking into account at least one of the following a priori known
information units {room in which the pointing device resides, previous command issued by
user, statistical frequency that a user issues a particular command and time of the day}. E.g.
the likelihood of a command can be calculated based upon a statistical profile of which
commands a user usually uses, and based upon previously issued commands. Also, the
likelihood of an apparatus can be updated based on the room in which the pointing device is
present. On Friday at eight o'clock the user might want to watch his favorite television
program.
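
One possible reading of such a probabilistic calculation is sketched below: each candidate {object, command} pair receives the product of its raw identification score, a prior for the object given the room, and the relative frequency with which the user issues the command, after which the scores are normalized. Treating these factors as independent, and all names and numbers used, are assumptions made for this sketch.

```python
def rank_pairs(match_scores, room_prior, command_frequency):
    """Return normalized likelihoods for (object, command) pairs.

    match_scores:      {(object, command): raw identification score}
    room_prior:        {object: prior probability of the object in this room}
    command_frequency: {command: relative frequency of use by this user}
    The factors are multiplied as if independent (an assumption).
    """
    weighted = {}
    for (obj, command), score in match_scores.items():
        weighted[(obj, command)] = (
            score * room_prior.get(obj, 0.0) * command_frequency.get(command, 0.0)
        )
    total = sum(weighted.values())
    if total == 0:
        return {}  # No candidate is compatible with the a priori information.
    return {pair: value / total for pair, value in weighted.items()}
```

With equal raw scores for a television and a radio, a room prior that favors the television decides the outcome; further factors such as the previous command or the time of day could be multiplied in the same way.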

It is also advantageous if as in claim 7 the digital signal processor comprises
object association means for providing to the object identification means object association
data comprising at least one of the data entities being: associated object characterizing
features and object related data,
the object association data being derivable from object template data in object
memory originating from at least one of the methods:
- the object template data is obtained from object training means performing a
predetermined calculation on second object characterizing features outputted by object
characterizing means; and
- the object template data is derived from inputted object data.

The user interaction system is much more versatile if, instead of being 
preconstructed by the manufacturer for controlling a fixed number of apparatuses, the user 
can adapt it to his changing environment. E.g. if he buys a new apparatus, he would like this 
apparatus to be controllable by means of the same system. Also if he buys a calendar, hangs 
it on the wall, and points to it with his pointing device, he would like an action to occur, e.g. 
starting up a calendar tool on his PC. For this, object association means are required. The user
images a new apparatus, e.g. under a few different angles. A number of image processing 
algorithms running on the object characterizing means extract features for this object which 
are stored as an object template, possibly after first calculating invariant apparatus specific 
features. He then selects with a selection user interface which apparatus corresponds to this 
object template. This is a so-called training phase of the system. When later the system is 
used in a so-called command phase, for sending apparatus control data to an identified
apparatus, the object identification by the object identification means can be aided by means 
of the data stored in the object association means. E.g. on the basis of the object template data 
a number of associated object characterizing features, corresponding to what the object 
characterizing means output as first object characterizing features, can be sent to the object 
identification means, which then performs a comparison, selecting the correct identified 
apparatus. These features can also be preloaded in the object identification means. The object 
association means can also send object related data to the object identification means, 
comprising at least an identifier of the apparatus, and possibly further data such as supported 
commands. In an alternative realization, the object association means could also perform the 
object identification itself, in which case only object related data is sent to the object 
identification means. To avoid that the user has to input a lot of information through the 
selection user interface, object related data can also be inputted through an object data input, 
e.g. directly from an apparatus to be identified and controlled. Via this input, characteristic 
object template data can also be inputted, e.g. the manufacturer of an apparatus might provide 
a picture taken under well-controlled imaging conditions.

It is also advantageous if as in claim 8 the digital signal processor comprises
signature association means for providing to the signature identification means signature
association data comprising at least one of the data entities being: associated signature
features and command related data,
the signature association data being derivable from signature template data in
signature memory originating from at least one of the methods:
- the signature template data is obtained from signature training means
performing a predetermined calculation on a second motion characterizing signature
outputted by the motion trajectory estimating means; and
- the command template data is derived from inputted command data.

According to a similar rationale, signature association means can be comprised 
to enable the training of new motion trajectories applied to the pointing device by the user 
and characterized as signatures. In these realizations command data is inputted e.g. by an 
apparatus which provides a menu of its supported commands. 

It is also advantageous if as in claim 9 the first motion characterizing signature
is derived on the basis of successive pictures imaged by the camera at respective instances of
time. Since a camera is attached to the pointing device, imaging its every move, and global
motion can be estimated by a simple motion estimation algorithm, successively imaged
pictures can be used for deriving the intended user interface command.

The second object is realized in that the pointing device according to claim 10
comprises a camera and is capable of sending a picture to a DSP. As already described above
the user interaction system becomes more versatile with such a pointing device.

An embodiment of the pointing device is characterized in claim 11 in that it is
capable of sending a picture to the digital signal processor, which is capable of sending user
interface information to an electrical apparatus based on the picture.

In an embodiment according to claim 12, the DSP is comprised in the pointing
device. The pointing device can then also operate separate from the user interaction system.
E.g. the pointing device can be taken to a supermarket, process imaged pictures and store the
corresponding first object characterizing features. Control actions by home apparatuses can
then be performed at a later time.

It is advantageous if as in claim 13 the pointing device comprises motion
sensing means for sensing a motion trajectory of the pointing device. Irrespective of whether
the device is used for recognizing objects, it can be used to send apparatus control data
corresponding to specific movements by the user. The intended apparatus in such an
application of the pointing device could e.g. be fixed or indicated with a button.

It is advantageous if as in claim 14 the pointing device comprises a
characteristic projector for optically projecting a characteristic pattern towards a region
pointed to. It is advantageous if the user gets feedback on which apparatus or part of an
apparatus he is pointing to. Otherwise, if he is not aiming correctly, he might e.g. virtually
push the wrong button of the apparatus. One form of feedback could be a small picture on a
display of the pointing device which shows the center of the region imaged by the camera.
An apparatus or button on an apparatus in the middle of the display is the one intended.
Another form of feedback comprises projection of a characteristic pattern, e.g. a white cross,
or a laser dot, which illuminates the intended part of an apparatus. More complex patterns
such as a two-dimensional grid of lines can aid the DSP in identifying the geometry of the
imaged apparatus.

It is advantageous if as in claim 15 the pointing device comprises a
programmable user interface code generator and a wireless transmitter for transmitting the
code to the electrical apparatus. Instead of the DSP sending apparatus control data to the
apparatus, e.g. by means of a wireless home network based on Bluetooth, the pointing device
itself may directly send the appropriate remote control command, e.g. by means of infrared
radiation. The advantage compared to a classical universal remote control is that the DSP
automatically identifies the apparatus, from a camera picture, and generates the correct code.

It is advantageous if as in claim 16 feedback means are present for feedback of 
user interface information. E.g. a display is useful. Another feedback can provide some kind 
of force feedback, e.g. by means of a mass which is moved. This can e.g. simulate tapping
against the screen of a TV.

The third object is realized in that interface means are comprised which allow
the apparatus to send information about supported commands to a pointing device as claimed
in claim 1, based on an "identify supported commands" call of the pointing device to the
apparatus. The DSP or part of the functionality of the DSP might also be comprised in the
apparatus. The apparatus might also transmit object characterizing features and so on.

The invention will be apparent from and elucidated with reference to the
following description and the accompanying drawing, showing schematically examples of
components of the user interaction system and pointing device according to the invention. In
this drawing:

Fig. 1 schematically shows a room in which a user interaction system
according to the invention is present;
Fig. 2 schematically shows a digital signal processor according to the
invention;
Fig. 3 schematically shows a pointing device according to the invention; and
Fig. 4 shows two examples of a motion trajectory and corresponding signature.

In Fig. 1 a room is shown containing various electrical apparatuses which can
receive user interface information I, e.g. electrical apparatus 110 being a plasma display, but
also a personal video recorder or a telephone or a washing machine can be present, with
which a user 100 can interact. The room also contains various objects, being electrical
apparatuses or just any physical object like e.g. a window, or object 130 being an audio
apparatus, vase 170 and electrical apparatus 110, that can be pointed to by means of a
pointing device 101.

According to the invention, the pointing device 101 contains a camera 102,
and can send pictures of regions of a room or objects in those regions to a digital signal
processor (DSP) 120, which can identify the regions or objects on the basis of one or more
pictures imaged by the camera 102. The camera is connected to the pointing device 101 in
such a way that it images the region pointed to well. E.g. it can typically reside at the far end
of the pointing device 101, but it could also be mounted on the side under an angle. The user
100 has the freedom to point to whatever object he wants, and in such a way a very
user-friendly and powerful user interaction system can be realized.

The DSP 120 can be e.g. a home computer controlling apparatuses in a home
network, or it can be incorporated in the pointing device 101. The camera 102 can be a CCD
camera or CMOS camera. The connection between the camera 102 and the DSP 120 can be
wireless if the two are separate. The pointing device 101 can already perform part of the
algorithms to be performed by the DSP 120 for identification of an object, so that e.g. only a
small set of features needs to be sent to the external DSP 120 for further processing. In fact the
functionally shown DSP 200 can be distributed in a number of parts in the pointing device
101, room, different apparatuses and so on.



WO 2004/04701 1 PCT/IB2003/004830 


The DSP 120 is designed to send user interface information I, e.g. apparatus
control data ac, to an identified apparatus. E.g. user 100 can point the pointing device 101 to
light 160 and push an on-button on the pointing device 101, which results in the DSP 120
sending an on-command to the identified light 160. The object identified need not be the
apparatus to be controlled itself. E.g. pointing at vase 170 may start a particular program on a
PC, the output of which is shown on the display apparatus 110. Parts of an object may be
pointed to, e.g. pointing at a volume button 134 of object 130 increases or decreases the
volume of this audio apparatus, in dependence on an additional action such as the pushing of
a button or a movement of the pointing device 101. Pointing to the channel selection display
132 and making a sideward movement may change the channel. The DSP 120 may also
recognize the status of a part of an object, e.g. that cassette loader 136 is left open, and issue
a "close" command. Other specific situations can also be detected or monitored. User 100
may also point to a piece of furniture 140 or a part 142 thereof. If the piece of furniture 140 is
a cupboard, different actions can result from pointing to different drawers. If the piece of
furniture 140 is an intelligent refrigerator, pointing to it can result in the user defined action
of ordering beer over the Internet. Apart from apparatus control data ac, other user interface
information I can be sent, e.g. picture data pd. E.g. first electronic painting 150 and second
electronic painting 152 display a variable picture. They can be realized with electrophoretic
electronic ink displays. User 100 can capture the picture displayed on the first electronic
painting, or even a picture on a calendar hanging on the wall, and transfer it as picture data pd
to the second electronic painting 152, possibly after some image processing such as e.g.
perspective correction.

Fig. 2 schematically shows functional components in a digital signal processor
200, an embodiment of 120 in Fig. 1. Not all components described are necessarily present in
a particular embodiment of the user interaction system. The DSP 200 can be realized as a
generic processor, an ASIC, or separate components on a circuit board. The DSP 200 may
receive camera input c and sensor input s, e.g. from motion sensing means 304. The DSP 200
may comprise an object characterizing unit 202 for characterizing a picture of an object 130
or part of the object 130 as imaged by the camera 102. The object characterizing unit 202 is
capable of providing first object characterizing features f1 to an object identification unit 204.
The first object characterizing features f1 may be of different kinds depending on the image
processing algorithms embodied in the object identification unit 204. Typically the object is
first isolated from the remaining parts of the imaged picture. This can be done e.g. on the
basis of edge and curve detection on the picture. Another option is 3D analysis of the imaged
region, e.g. with a projected grid or a second camera. Part of an object may be detected on the
basis of correlation techniques, and the rest of the object can then be verified.

Second, the first object characterizing features f1 are calculated. E.g. a contour
of the object and its composing parts can be characterized by means of a curvature versus
angle characteristic. Or the ratio width/height may be calculated. Also, the isolated object
region itself, or a transformation of it may be used, typically after first performing a
perspective correction.
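
As a small illustration of one such feature, the sketch below computes the width/height ratio of an isolated object from a binary mask in which non-zero pixels belong to the object. The mask representation and the function name are assumptions; the isolation step itself and any perspective correction are not shown.

```python
def width_height_ratio(mask):
    """Return the width/height ratio of the bounding box of the object.

    `mask` is a list of rows; truthy entries mark object pixels.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        raise ValueError("mask contains no object pixels")
    width = cols[-1] - cols[0] + 1
    height = rows[-1] - rows[0] + 1
    return width / height
```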

The object identification unit 204 then identifies the object on the basis of the
first object characterizing features f1 received from the object characterizing unit 202. E.g.
any known classification technique in multidimensional feature space might be used. The
output of the object identification unit 204 is object identification data oi, which can be a
simple number or a data structure containing additional information about the identified
object. The characterizing features and further data of a number of apparatuses might be e.g.
stored a priori in an object characteristics memory 230 by the manufacturer of the pointing
device containing the DSP 200, or might be preloaded from the Internet if the DSP 200 is a
computer. Instead of identifying an object, the DSP may also identify a region of the room.
E.g. pointing the pointing device 101 towards the ceiling may issue the command "lights on",
whereas pointing to the floor switches the lights off. Or pointing to one of the corners may
invoke an associated command. The corners may be identified on the basis of objects present,
e.g. on the left side a plant and on the right side a cupboard.

The user interaction system is much more useful however if a user can train
new objects, such as vase 170, himself. Therefore an object association unit 212 might be
comprised, which provides the object identification unit 204 with object association data oad.
The object association data oad might comprise characterizing features fa of an object,
similar to the first object characterizing features f1, based on an object template. The object
template can be obtained by applying a training phase. A new object is imaged by the camera
102 under a number of conditions, e.g. different angles, lighting etc. The object
characterizing unit 202 generates second object characterizing features f2 for all the pictures.
An object training unit 221 in the object association unit 212 generates an object template
corresponding to the object which is stored in object memory 220. The template can e.g. be
the cluster of second object characterizing features f2 of all the pictures, or average features,
or some invariant feature calculated on the basis of the variable second object characterizing
features f2. The object template might also be derived by the object training unit 221 on the
basis of characterizing features coming in from the outside as object data od. This object data
might originate from an apparatus which inputs e.g. pictures of it taken from different angles
by a manufacturer. The object data od and object association data oad might also comprise
object related data id, i.e. all kinds of information concerning an apparatus, like e.g.
supported commands.
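
The "average features" variant of the object template mentioned above can be sketched as follows. The representation of features as equal-length numeric tuples and the function name are assumptions for illustration; a cluster or an invariant feature would be equally valid templates according to the description.

```python
def train_template(feature_vectors):
    """Return the component-wise mean of the second object characterizing
    features obtained from several pictures of the same object."""
    if not feature_vectors:
        raise ValueError("at least one feature vector is required")
    count = len(feature_vectors)
    dimension = len(feature_vectors[0])
    return tuple(
        sum(vector[i] for vector in feature_vectors) / count
        for i in range(dimension)
    )
```

For example, features measured under two viewing angles, `(1.0, 2.0)` and `(3.0, 4.0)`, give the template `(2.0, 3.0)`, which could then be stored in the object memory under the identifier chosen by the user.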

To facilitate object recognition, certain objects like e.g. apparatus 110 may 
display a characteristic pattern 116, e.g. they may flash a red cross a predetermined number 
of times with different on/off intervals, in response to an "object identify" call of the pointing 
device. The apparatus 110 might also comprise an interface unit 190, which makes it more 
useful in a user interaction system according to the invention. E.g. an apparatus like an 
internet radio can have a "service discovery" function which communicates to the pointing 
device 101 which commands it supports. It might even transmit infrared remote control 
sequences corresponding with these commands, and so on. This facilitates the training of the 
DSP 120, 200 by the user 100. E.g. in response to the communication, the commands can 
appear as a menu of predefined icons on a display 316 of the pointing device 101. The 
interface unit 190 can also be realized separately. E.g. a washing machine might not have an 
intricate remote control facility. A plug-on box can be provided which receives standard 
commands from the DSP 200 and interfaces in a simple way with the apparatus, e.g. starting 
it. 

The DSP 200 might also comprise components for analyzing gestures made 
with the pointing device 101 by the user 100. For this, the motion trajectory 400 in Fig. 4 of 
the pointing device 101 is used, which is calculated e.g. on the basis of pictures from the 
camera 102. Fig. 4a shows an upward motion trajectory 400 in a three-dimensional 
coordinate system 404. It is summarized by signature 402 being a mathematical 
representation of that upward motion trajectory. The signature 402 can be linked, e.g. by 
means of a command table, with command identification data ci, e.g. in this case the user 100 
might reserve this motion for a "volume up" command. A circular motion trajectory 410 and 
circular signature 412 might mean to one user "rewind", whereas another user has trained the 
DSP 200 to relate this circular signature 412 with a "device on" command. The user can e.g. 
roughly orient his pointing device 101 towards light 160 and make the "device on" motion, 
on the basis of which the DSP 200 switches the light 160 on. Alternatively the user 100 can 
also make the "device on" motion somewhere in the air and then point to light 160, or a 
television or whatever object he wants to be switched on. The signatures 402, 412 can be 
parameterized in any way and matched by any known curve fitting technique. E.g. the
signature 402 can be coded as a number of connected line segments of specific length and
intersegment angle.
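
A minimal sketch of such a coding is given below, assuming for simplicity that the trajectory has already been reduced to a list of two-dimensional points (the description uses a three-dimensional coordinate system). Each segment is stored as its length together with the signed turning angle, in degrees, relative to the previous segment. The function name and the output format are hypothetical.

```python
import math


def encode_signature(points):
    """Code a polyline as (segment length, intersegment angle in degrees).

    The angle of the first segment is defined as 0.0, so that the code
    does not depend on the absolute orientation of the trajectory.
    """
    segments = []
    previous_heading = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        heading = math.atan2(y1 - y0, x1 - x0)
        if previous_heading is None:
            turn = 0.0
        else:
            # Wrap the heading difference into the range [-pi, pi).
            turn = (heading - previous_heading + math.pi) % (2 * math.pi) - math.pi
        segments.append((length, math.degrees(turn)))
        previous_heading = heading
    return segments
```

A straight upward motion yields only zero angles, whereas a roughly circular motion yields a sequence of turns with the same sign, which makes the two easy to tell apart in a later matching step.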

A motion trajectory estimation unit 206 calculates a motion trajectory 400, on 
the basis of motion sensing means 304, or successive pictures from the camera 102. E.g. 

5 optical flow or block based motion estimation techniques can be applied on two successive 
pictures to obtain a part 401 of motion trajectory 400. The divergence of the motion vector 
field can be used to estimate motion towards on object, i.e. typically along an axis of the 
pointing device 101. On the basis of motion trajectory 400, a signature generation unit 209 
outputs a first motion characterizing signature si. A signature identification unit 208 

10 identifies the first motion characterizing signature si and link it with command identification 
data ci, being e.g. a numerical representation of a user interaction command intended by the 
user 100. The first motion characterizing signature si can be compared with signature 
characteristics for different stored signatures in a signature characteristics memory 232. E.g. 
if a roughly circular or even any returning movement is made, a stored circular signature 

15 template will yield a better match than a linear signature template. Geometrical and structural 
curve matching techniques can be applied in the similarity calculation. 
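
As a minimal sketch of such a similarity calculation, a motion characterizing signature could be reduced to two structural features and compared with stored templates. The two features (closure and net turning), the template values and the names signature_features, TEMPLATES and best_match are assumptions chosen for illustration. Any other known curve matching technique could be substituted.

```python
import math

def signature_features(points):
    """Return (closure, turning) for a trajectory given as (x, y) tuples.

    closure: end-to-start gap divided by path length (0 for a returning movement).
    turning: absolute net change of heading, in full revolutions.
    """
    segs = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    path_len = sum(math.hypot(dx, dy) for dx, dy in segs)
    if path_len == 0:
        raise ValueError("trajectory has no extent")
    gap = math.dist(points[0], points[-1])
    headings = [math.atan2(dy, dx) for dx, dy in segs if (dx, dy) != (0, 0)]
    turning = 0.0
    for h0, h1 in zip(headings, headings[1:]):
        # Accumulate heading changes, each wrapped into [-pi, pi).
        turning += (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
    return gap / path_len, abs(turning) / (2 * math.pi)

# Hypothetical stored signature characteristics: (closure, turning).
TEMPLATES = {"circular": (0.0, 1.0), "linear": (1.0, 0.0)}

def best_match(points, templates=TEMPLATES):
    """Name of the template whose characteristics lie closest to the trajectory's."""
    feats = signature_features(points)
    return min(templates, key=lambda name: math.dist(feats, templates[name]))
```

With these assumed features, a closed loop of sampled points yields a closure near 0 and a turning near 1, so that the circular template is selected, whereas a straight stroke selects the linear template.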

Just like it is advantageous to have a DSP 200 which is able to learn to identify 
new objects, it is advantageous if the DSP 200 can identify new movements preferred by the 
user 100. Therefore the signature association unit 214 might be comprised for providing 
signature association data SAD to the signature identification unit 208. The signature 
association data SAD may comprise associated signature features sa, e.g. a mathematical 
representation of the circular signature 412, and/or command related data cid, e.g. received 
from an apparatus sending its supported commands as command data cd to the DSP 200. 
Signature templates in a signature memory 222 can be obtained from a signature training unit 
223 on the basis of a first motion characterizing signature s2 from the motion trajectory 
estimation unit 206 or from command data cd. 

Since the environment of the user is very variable, e.g. the lighting changes 
during the day, the identification of the object 130 or a command, i.e. a signature 402, might 
sometimes be incorrect. Therefore the DSP 200 might comprise an identification 
improvement unit 210. The intended command is e.g. dependent on the object 130 pointed to, 
or might be dependent on the command given previously, and so on. An object is identified 
incorrectly if it is not present in the room the pointing device is residing in. Therefore room 
recognizing means 185 may be present, e.g. flashing LEDs. The user 100 can scan the room 
recognizing means 185 with the pointing device 101 when he enters the room. Apart from the 
camera 102, a photodiode might be comprised in the pointing device 101 for detecting the 
room recognizing means 185. Room localization beacons 180, 181, 182 may also be present, 
so that the DSP 200 can recognize to which part of the room the pointing device 101 is 
pointing. The beacons could be light emitting in case the camera 102 is to recognize them, 
but they could also e.g. emit electromagnetic radiation. Objects present in the room can also 
aid in the recognition of an object. E.g. if vase 170 is next to object 130, its recognition can 
aid in the recognition of object 130. Even analysis of the past trajectory of the pointing 
device can be used. If the pointing device was previously pointing to object 130 and it has 
been moved to the right, it should be pointing to apparatus 110. Taking all this information 
into account, the identification improvement unit 210 can calculate e.g. Bayesian 
probabilities or use fuzzy logic to arrive at a more certain identification of the object 130 and 
the intended command. 
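
One way in which Bayesian probabilities might be calculated from such information is sketched below. The sketch assumes that the individual information sources are conditionally independent (a naive Bayes simplification). The function name fuse_identification and all numerical values are hypothetical and serve only as an example.

```python
def fuse_identification(prior, *likelihoods):
    """Combine a prior over candidate objects with independent evidence sources.

    prior: dict mapping candidate name to prior probability, e.g. derived
           from which objects are present in the recognized room.
    likelihoods: dicts mapping candidate name to the likelihood of one
           observation; candidates missing from a dict are ruled out by it.
    Returns the normalized posterior as a dict.
    """
    posterior = {}
    for name, p in prior.items():
        for lk in likelihoods:
            p *= lk.get(name, 0.0)
        posterior[name] = p
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("evidence rules out every candidate")
    return {name: p / total for name, p in posterior.items()}

# Hypothetical values: the picture match slightly favours object 130, but the
# device was moved to the right after pointing to object 130.
room_prior = {"object 130": 0.5, "apparatus 110": 0.5}
picture_match = {"object 130": 0.6, "apparatus 110": 0.4}
moved_right = {"object 130": 0.2, "apparatus 110": 0.8}
posterior = fuse_identification(room_prior, picture_match, moved_right)
```

With these assumed numbers the posterior favours apparatus 110, although the picture match alone would have favoured object 130.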

The user interaction information I outputted by the DSP 200 typically 
comprises control apparatus data ac, being e.g. a structure comprising the object 
identification data oi and the command identification data ci, or a new identifier based on 
those, identifying an intended command for a particular apparatus 110. The user interaction 
information I may also comprise picture data pd. 
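
Purely as an illustration, the user interaction information I could be represented by a structure such as the following. The class and field names, the use of integers for oi and ci, and the 16-bit packing used for the combined identifier are all assumptions of this sketch and are not specified by the description.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ControlApparatusData:
    """Control apparatus data ac."""
    object_id: int   # object identification data oi
    command_id: int  # command identification data ci

    def combined_id(self) -> int:
        # Hypothetical "new identifier based on those": oi in the high bits,
        # ci in the low 16 bits (assumes command_id < 65536).
        return (self.object_id << 16) | self.command_id

@dataclass(frozen=True)
class UserInteractionInformation:
    """User interaction information I output by the DSP."""
    control: ControlApparatusData
    picture: Optional[bytes] = None  # optional picture data pd

info = UserInteractionInformation(ControlApparatusData(object_id=110, command_id=1))
```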

Fig. 3 schematically shows a pointing device 300, an embodiment of 101 in 
Fig. 1, according to the invention. Not all features shown need be present: a simple pointing 
device may only comprise a camera 302 and e.g. a display 316 for feedback. A number of 
buttons e.g. button 308 might be present. This allows for giving certain commands only by 
pushing a button. Also similar movements of the pointing device 300 can be used for two 
different commands, once with and once without simultaneously pushing the button 308. A 
characteristic projector 320 may be present, e.g. for projecting a laser spot towards the 
location pointed to. The camera of the pointing device is capable of sending a picture to the 
external DSP 120, but the pointing device might also itself comprise the DSP 200. In this way the pointing 
device can also be used outdoors. A pointing device 103 not carried by the user 100 might 
also be used to monitor an apparatus 110 or room. Apart from sensing the motion by means 
of the camera, other motion sensing means 304 may also be comprised, e.g. a gyroscope or 
differential GPS. A programmable code generator 309 and wireless transmitter 310 may also 
be present for sending a command directly to an apparatus 110. Feedback means might be 
present for providing the user with additional information. E.g. a light 312, a sound 
production device 314, a force feedback means 306, and in particular a display 316 might be 
comprised. The force feedback means can comprise a small moveable mass, which can 
simulate vibration or the touching of an apparatus. The display can show e.g. the center of the 
region pointed to, possibly superimposing a graphically generated cross in the center of the 
picture imaged. This can also be displayed on a television. Similarly the light 160 might be 
used to provide feedback instead of light 312. The feedback on the display 316 might show 
e.g. the motion trajectory 400 and how similar it was to previous motion trajectories during 
the motion trajectory training phase, aiding the user 100 to teach the DSP 200 to recognize a 
stable gesture. The display 316 might also indicate what the DSP 200 sees by showing a 
picture after a number of image processing steps, e.g. a picture containing detected edges, 
and comprising text naming the identified device. 

User identification means 330 might also be present, e.g. a fingerprint sensor, 
or an input for a password or personal smart card. In this way the DSP 200 can set itself to a 
mode of operation particular to the user 100. The user might even give different commands 
depending on the room he is in. Speech recognition means 340 may also be comprised, to 
supplement the commands given by movements of the pointing device 300. Games could be 
played by using the pointing device 101 as some kind of interface device to the real world. 



